V0.3.0 #418
Regarding the deletion of the random.py file: this is a fix for issue #364. The random.py file was meant to address the reproducibility of the Markov chain: the idea was to set the global random seed to 2018 at every point where the random module is imported internally within the gerrychain package. However, this approach also fixes the trees generated by tree.py whenever the user does not set the random seed after importing gerrychain. An import pattern of

    import random
    random.seed(0)
    import gerrychain
    print(random.random())
    print(random.random())

will output

    0.5331579307274593
    0.02768951210200299

as opposed to the expected

    0.8444218515250481
    0.7579544029403025

because importing gerrychain forces the random seed to be 2018 rather than the expected 0. This can often cause issues in Jupyter notebooks, where the user is not aware that the random seed has been forcibly set to 2018 after the import of gerrychain. Instead, it is best to allow the user to set the random seed themselves and not to set it forcibly within the gerrychain package, since that can affect the execution of other packages and can cause the chain to hang when the 2018 seed does not produce a valid tree.

This issue does not appear if we remove the random.py file and instead use the random module from the standard library within the tree.py and accept.py files. This is because of how Python handles successive imports of the same module. Consider the following snippet:

    import random
    random.seed(0)
    import random
    print(random.random())
    print(random.random())

This will output

    0.8444218515250481
    0.7579544029403025

as expected. The random module is only executed on its first import, after which it is registered in Python's internal cache of imported modules (sys.modules). Subsequent imports of the random module within the same Python session simply retrieve the module from that cache and do not re-execute the code contained within the module. Thus, the random seed is only set once and is not reset when the random module is imported again.

In terms of reproducibility, this means that the user will be required to set the random seed themselves if they want to reproduce the same chain, but this is a relatively standard expectation, and it will be required when we move the package over to a Rust backend in the future.
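As a way to check the import-caching argument above without installing gerrychain, here is a minimal sketch using only the standard library. The helper names draws_after_reimport and draws_from_fresh_generator are made up for this illustration and are not part of gerrychain.

```python
import importlib
import random
import sys


def draws_after_reimport(seed, n=2):
    """Seed the global generator, import `random` again, then draw n values."""
    random.seed(seed)
    # A second import is a lookup in sys.modules; the module body is not
    # executed again, so the seed set above is left untouched.
    reimported = importlib.import_module("random")
    assert reimported is sys.modules["random"] is random
    return [random.random() for _ in range(n)]


def draws_from_fresh_generator(seed, n=2):
    """Reference values from an independent generator with the same seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


print(draws_after_reimport(0))
print(draws_after_reimport(0) == draws_from_fresh_generator(0))
```

Both helpers return the same values for the same seed, which is what one would expect if re-importing random leaves the generator state alone.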
Codecov Report
@@ Coverage Diff @@
## main #418 +/- ##
==========================================
+ Coverage 88.88% 91.88% +3.00%
==========================================
Files 39 38 -1
Lines 1790 1911 +121
==========================================
+ Hits 1591 1756 +165
+ Misses 199 155 -44
Ignore this. The code coverage has gone up compared to base, and several tests were added for this release.
This should be the main PR for v0.3.0. Below is a summary of the proposed release notes:
What's Changed
Major updates have been made to all of the documentation to try to make it
(i) More complete
(ii) More accessible for new Python users
Bipartition Tree now has the max_attempts default set to 100000 to prevent infinite loop conditions; this value should still give plenty of time to sample sufficiently from the set of spanning trees.
The recom method now has functionality for running region-aware chains, which allow for the construction of ensembles of districting plans that try not to split a particular region or set of regions.
tally_region_splits has been added to make it easy to quickly tally the number of splits of a region type.
Many of the issue tickets have been resolved. See below for more info.
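To make the idea of tallying region splits concrete, here is a small pure-Python sketch. It is not gerrychain's tally_region_splits and does not use the gerrychain API; the function name tally_splits, the toy data, and the definition of a split (the number of districts a region touches, minus one) are all assumptions made for illustration.

```python
from collections import defaultdict


def tally_splits(assignment, regions):
    """Count, for each region, how many extra districts it touches.

    assignment: node -> district label
    regions:    node -> region label (for example, a county name)
    A region contained in a single district contributes 0 splits.
    """
    districts_by_region = defaultdict(set)
    for node, district in assignment.items():
        districts_by_region[regions[node]].add(district)
    return {region: len(ds) - 1 for region, ds in districts_by_region.items()}


# Toy plan: five nodes, two districts, two regions.
assignment = {0: "A", 1: "A", 2: "B", 3: "B", 4: "B"}
regions = {0: "north", 1: "north", 2: "north", 3: "south", 4: "south"}

print(tally_splits(assignment, regions))  # {'north': 1, 'south': 0}
```

Here the "north" region is shared between districts A and B, so it counts as one split, while "south" sits entirely inside district B.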
Deprecations
Resolved Issues
Code from conda install doesn't match latest GitHub code #412 Resolved due to deprecation of conda-forge
Saving and loading Partition object #409 Related to the following issue:
Compatibility issue with shapely 2.x #408 Resolved with the deprecation of conda-forge
gerrychain.random sets random seed globally #364: The global random seed is no longer set in a custom random file. The user must now set the seed manually if they wish for their work to be reproducible.
Node repeats for ReCom should reselect the pairing #319 is now fixed for recom. In bipartition_tree, the user may now specify the parameter allow_pair_reselection as true or false. The default is false to maintain backwards compatibility, but when this parameter is set to true, an error is propagated back up to recom and a new pair is selected there. The selection of the new pair does not advance the chain. See the documentation for more information.
Examples of collecting outputs (pandas, xarray) #294 This is now a part of the "Good Data Practices" section of the documentation.
Better __repr__'s #288 __repr__ methods have been added across the package.
constraints for the is_valid property, which allows for the printing and editing of constraints. For example, one may print the constraints with the call print(chain.constraints), which will print something like [<function single_flip_contiguous at 0x7f3d80f45b20>], and the constraints can be set with something like chain.constraints = [contiguous]. When set, the constraints are checked against the initial state of the chain. This improves the UX because you do not need to reinitialize a full MarkovChain object to experiment with different constraints.
Plot cookbook #279 This has been partially addressed in the documentation update; the remaining reference material on how to plot data is left to the matplotlib and seaborn documentation.
Tutorial on contributing a function for validity, next step, etc. #44 This is now a part of the documentation.
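For the pair reselection described under #319 above, the control flow can be sketched roughly as follows. This is a toy model rather than the gerrychain implementation: the ReselectPair exception, try_bipartition, propose, and the success_chance field are hypothetical stand-ins, and only the allow_pair_reselection flag and its default of false come from the notes above.

```python
import random


class ReselectPair(Exception):
    """Hypothetical signal: no balanced cut was found for this pair."""


def try_bipartition(pair, rng, max_attempts=3, allow_pair_reselection=False):
    """Stand-in for a tree bipartition that may fail for a given pair."""
    for _ in range(max_attempts):
        if rng.random() < pair["success_chance"]:
            return f"split of {pair['name']}"
    if allow_pair_reselection:
        # Hand control back to the caller so it can pick another pair.
        raise ReselectPair(pair["name"])
    raise RuntimeError("could not find a balanced cut")


def propose(pairs, rng):
    """Keep choosing district pairs until one can be split.

    Retries happen inside a single proposal, so the caller's step
    counter (the chain) does not advance while pairs are reselected.
    """
    while True:
        pair = rng.choice(pairs)
        try:
            return try_bipartition(pair, rng, allow_pair_reselection=True)
        except ReselectPair:
            continue  # pick a different pair and try again


pairs = [
    {"name": "hard", "success_chance": 0.0},  # can never be split
    {"name": "easy", "success_chance": 1.0},  # can always be split
]
print(propose(pairs, random.Random(1)))  # split of easy
```

With reselection switched off, the "hard" pair would exhaust its attempts and raise instead of giving the caller a chance to choose again.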